We don’t speak for our employer. All the opinions and information
here are our responsibility (actually, no one ever saw this talk
before).


So, mistakes and bad jokes are all OUR responsibility.


Introduction / Motivation 


> Number of new malware samples grows at an absurd pace 


> We still see words such as ‘many’ instead of the actual 
number of analyzed samples 


> Assumptions without concrete data supporting them 


> INDUSTRY-RELATED RESEARCH NEEDS RESULTS, SO UNPROMISING 
LEADS ARE NOT PURSUED 


Objectives 


> Demonstrate the possibility of in-depth large-scale 
malware analysis 


> Distribute and scale IDA Pro (with Decompiler) to 
leverage its functionalities for automated malware 
analysis 


> Share with the community the obtained results: 
v IDA Pro IDBs, plugins and scripts 
v Intermediate representation 
v MS Visual C++ reconstructed types 


v And more... 


Methodology: Highlights 


> Analyzed 32-bit and 64-bit (x86-64) non-packed PE 
samples from public sources 


> No malware size limitations at all 


> Preference for MS Visual C++ samples because of the 
HexRaysCodeXplorer OO type reconstruction 
feature 


> Details on the infrastructure already discussed 
in Black Hat Las Vegas 2012 presentation 


Methodology: Overview of the process 


Phase 1: Collect samples
> Pre-process samples and collect millions of 32-bit and 
64-bit (x86-64) non-packed PE malware samples

Phase 2: Extract information
> Run different malware analysis algorithms on the 
collected samples and store results on the filesystem

Phase 3: Analyze and parse information
> Parse and structure the results

Phase 4: Generate statistics and charts
> Generate statistics and charts based on structured 
information
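
The hand-off between Phase 2 (results stored on the filesystem) and Phase 3 (parsing them back) can be sketched as below. This is only an illustration: `analyze_sample`, `run_phase2` and `run_phase3` are hypothetical names, the per-sample "analysis" is a trivial stand-in for the IDA-based algorithms, and the one-JSON-file-per-SHA-256 layout is an assumption, not the layout used in the talk.

```python
import hashlib
import json
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def analyze_sample(path):
    """Hypothetical stand-in for one Phase 2 analysis of a single sample."""
    data = Path(path).read_bytes()
    return {
        "sha256": hashlib.sha256(data).hexdigest(),
        "size": len(data),
        "is_pe": data[:2] == b"MZ",  # crude check on the DOS header magic
    }


def run_phase2(sample_dir, out_dir, workers=4):
    """Analyze every file in sample_dir and store one JSON file per sample."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    samples = sorted(p for p in Path(sample_dir).iterdir() if p.is_file())
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for result in pool.map(analyze_sample, samples):
            # Assumed layout: results are keyed by the sample hash.
            (out / f"{result['sha256']}.json").write_text(json.dumps(result))
    return len(samples)


def run_phase3(out_dir):
    """Parse the stored results back into structured records."""
    return [json.loads(p.read_text()) for p in sorted(Path(out_dir).glob("*.json"))]
```

Keeping the phases decoupled through files means a failed analysis can be re-run for one sample without touching the rest.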


Methodology: Only static analysis 


> We only used static analysis 


> Not detectable by malware... unless it exploits 
the analysis environment! 


> Prone to anti-disassembly tricks 


> Has some limitations... but powerful tools and 
techniques are available 


> IDA Pro rocks!! :) 


Methodology: Malware analysis algorithms 


> HexRaysCodeXplorer (by @REhints) used for: 


v Ctrees* for some IDA-recognized functions 


v MS Visual C++ object-oriented types REconstruction 


> Ctrees depth analysis 

v Highly-modified version of pathfinder by @devttyS0 


> OO “this” usage study 


> Crypto usage detection based on IDAscope by 
@push_pnx 


* - ctrees is the intermediate representation in the Hex-Rays decompiler 


Constraints and Limitations: Dumping Ctrees 


Enumerate routines
> Iterate through recognized routines in the IDB
> Process the first 60 routines larger than 0x160 bytes
> Process the first 30 crypto (using AES-NI) routines
> Process the first 40 other functions larger than 0x60 bytes


Obtain IR
> Decompile routine to get ctree (IR)
> Serialize ctree to string


Ctree normalization
> See implementation of ctree_dumper_t::filter_citem()
> Use normalized ctree for comparison
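
The per-sample routine limits above can be sketched as a simple bucketed selection. This is an illustration only: the `Routine` record, its `uses_aesni` flag and the priority order between the three buckets are assumptions, and the real plugin works on IDA function objects rather than plain records.

```python
from dataclasses import dataclass


@dataclass
class Routine:
    name: str
    size: int          # routine size in bytes
    uses_aesni: bool   # hypothetical flag: routine contains AES-NI instructions


# Limits taken from the slide.
LARGE_MIN_SIZE, LARGE_LIMIT = 0x160, 60
CRYPTO_LIMIT = 30
OTHER_MIN_SIZE, OTHER_LIMIT = 0x60, 40


def select_routines(routines):
    """Pick the routines whose ctrees would be dumped for one sample."""
    large, crypto, other = [], [], []
    for r in routines:
        # Assumed priority: large bucket first, then crypto, then "other".
        if r.size > LARGE_MIN_SIZE and len(large) < LARGE_LIMIT:
            large.append(r)
        elif r.uses_aesni and len(crypto) < CRYPTO_LIMIT:
            crypto.append(r)
        elif r.size > OTHER_MIN_SIZE and len(other) < OTHER_LIMIT:
            other.append(r)
    return large + crypto + other
```

Capping each bucket keeps the cost per sample bounded, which matters more than completeness when the corpus has millions of files.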


Constraints and Limitations: VTBL reconstruction algorithm 


Detect VTBL
> Find all calls with “this” pointer to an offset 
within ".rdata"/".data" and data sections
> Find all xrefs to virtual tables


Recognize layout
> Calculate size of virtual tables
> Recognize all virtual methods


Add new VTBL Type
> Create new structure for VTBL layout representation
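
The "calculate size of virtual tables" step can be approximated by counting consecutive pointers that land inside the code section. The sketch below works on a raw byte buffer instead of an IDB; the function name and parameters are hypothetical, and a real implementation would also stop where another cross-reference marks the start of the next table.

```python
import struct


def vtable_size(data, data_base, vtbl_addr, text_start, text_end, ptr_size=4):
    """Count consecutive code pointers starting at vtbl_addr.

    data      -- raw bytes of the data section (e.g. .rdata)
    data_base -- virtual address at which `data` is mapped
    text_*    -- [start, end) virtual address range of the code section
    """
    fmt = "<I" if ptr_size == 4 else "<Q"  # little-endian 32/64-bit pointer
    offset = vtbl_addr - data_base
    count = 0
    while offset >= 0 and offset + ptr_size <= len(data):
        (target,) = struct.unpack_from(fmt, data, offset)
        if not (text_start <= target < text_end):
            break  # first entry that is not a code pointer ends the table
        count += 1
        offset += ptr_size
    return count
```

Each counted slot is a candidate virtual method, which is what the "recognize all virtual methods" step would then name and type.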


Constraints and Limitations: Complex types REconstruction algorithm 


Detect Type
> Find pointers to possible type instances
> Find initialization routine entry point


Recognize Type layout
> Find all references to possible type address space
> Find all xrefs to the attributes of the identified type
> Reconstruct data flow for the identified type


Add new Type definition
> Create new local type if it has more than 3 attributes
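
The final step, turning observed attribute accesses into a type only when there are more than 3 attributes, might look like the sketch below. The `(offset, size)` input format, the `field_`/`gap_` naming and the handling of overlapping accesses are assumptions made for illustration; the plugin derives this information from the decompiler's data flow.

```python
def build_layout(accesses, min_attributes=3):
    """Build a struct layout from (offset, size) accesses relative to the object.

    Returns a list of (name, offset, size) tuples, or None when the type
    does not have more than `min_attributes` distinct attributes.
    """
    fields = {}
    for offset, size in accesses:
        # Keep the widest access seen at each offset.
        fields[offset] = max(size, fields.get(offset, 0))
    if len(fields) <= min_attributes:
        return None  # too few attributes: no new local type is created
    layout, cursor = [], 0
    for offset in sorted(fields):
        if offset < cursor:
            continue  # overlapping access, ignored in this simplified sketch
        if offset > cursor:
            layout.append((f"gap_{cursor:X}", cursor, offset - cursor))
        layout.append((f"field_{offset:X}", offset, fields[offset]))
        cursor = offset + fields[offset]
    return layout
```

The threshold filters out small ad hoc structures that would otherwise flood the local types list with noise.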


Constraints and Limitations: Ctrees Depth Analysis 


Code xrefs to the routine
> Use breadth-first search algorithm
> Limit: 100 nodes


Get statistics
> Distance from entry point
> Depth counter
> Number of xrefs
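
A breadth-first walk over code xrefs with a node limit, as described above, can be sketched as follows. The `callers` mapping (routine name to the routines that reference it) is a hypothetical stand-in for IDA's xref queries, and returning `None` when the entry point is not reached within the limit is an assumption.

```python
from collections import deque


def distance_from_entry(callers, routine, entry, node_limit=100):
    """Walk xrefs-to `routine` breadth-first until `entry` is reached.

    Returns (depth, visited): depth is the number of xref hops from the
    entry point, or None if it was not reached within node_limit nodes.
    """
    seen = {routine}
    queue = deque([(routine, 0)])
    while queue:
        node, depth = queue.popleft()
        if node == entry:
            return depth, len(seen)
        for caller in callers.get(node, ()):
            # Stop discovering new nodes once the limit is hit, but keep
            # draining the nodes that are already queued.
            if caller not in seen and len(seen) < node_limit:
                seen.add(caller)
                queue.append((caller, depth + 1))
    return None, len(seen)
```

Because the search is breadth-first, the first time the entry point is dequeued the depth is the shortest xref distance.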


Constraints and Limitations: C++ "this" usage study 


Scan entry point section
> Check up to 5000 call instructions


Detect “this” usage
> Scan 5 instructions preceding the call
> Check ECX loads (“mov” and "lea")


Gather statistics
> Compute percentage of calls “loading” ecx
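
The heuristic above can be sketched on an already disassembled instruction list. The `(mnemonic, operands)` tuple format and the function name are assumptions for illustration; the study itself ran inside IDA on the section containing the entry point.

```python
def this_usage_percentage(instructions, max_calls=5000, window=5):
    """Percentage of calls preceded by a `mov`/`lea` that loads ecx.

    instructions -- list of (mnemonic, operands) tuples in address order,
                    where operands is a tuple of strings (destination first)
    """
    calls = loading = 0
    for i, (mnemonic, _operands) in enumerate(instructions):
        if mnemonic != "call":
            continue
        if calls == max_calls:
            break  # only the first max_calls call instructions are checked
        calls += 1
        preceding = instructions[max(0, i - window):i]
        if any(m in ("mov", "lea") and ops and ops[0] == "ecx"
               for m, ops in preceding):
            loading += 1
    return 100.0 * loading / calls if calls else 0.0
```

The window is deliberately naive: it does not stop at an earlier call or branch, so it can overcount, which is acceptable for a coarse statistic over a large corpus.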


Distributing IDA Pro: Highlights 


> Unexpected performance benefits on IDA because the 
information is structured 


v But we also came across some disadvantages: the SDK is complex, 
its function signatures change from version to version, and it is 
not fully documented 


> Good performance in commodity hardware 


> C-based plugins are usually not compatible with 
Linux/Mac 


v Portability efforts are required 


Distributing IDA Pro: Highlights 


> IDA plugins are usually not made to scale 


v They target single-sample analysis 
v They focus on users interacting with the IDA Pro interface 


> Automated malware analysis exercises the internal plugin 
flows much more than manual analysis does 


v As a result, corner cases and bugs were identified in many plugins, 
including HexRaysCodeXplorer 


VALIDATING THE METHODOLOGY AND TOOLSET 


ANALYSIS OF C++ TARGETED MALWARE 


“Animal Farm” Case Study 


> Discovered by CSEC 

v From the CSEC SNOWGLOBE slide: “CSEC assesses, with moderate certainty, 
SNOWGLOBE to be a state-sponsored CNO effort, put forth by a French 
intelligence agency” 


> Samples: NBOT, Dino, Babar, Bunny, Casper 


> Written in MS Visual C++ 


* - “Totally Spies”, Joan Calvet, Marion Marschalek, Paul Rascagnères, 
http://recon.cx/2015/slides/recon2015-01-joan-calvet-marion-marschalek-paul-rascagneres-Totally-Spies.pdf 


Animal Farm: Shared C++ Types 


[Figure: graph of Animal Farm samples linked by shared custom C++ types; 
edge labels: 6, 3, 3, 6, 15 and 3 shared custom types] 


Conclusions 


> We demonstrated that IDA Pro scales really well and that all 
its powerful features can be used in automated malware 
analysis systems 


v CALL TO ACTION: IDA Pro plugin developers should start adding batch-mode 
switches and optimizing their algorithms 


> Want to run your IDA plugin on millions of malware samples? Let 
us know! :) 


Resources 


Presentation, code and instructions on how to download samples, 
IDBs and outputs will be available at: 


https://github.com/REhints/BlackHat_2015 


CodeXplorer v2.0 [BH Edition] 


> The plugin finally supports Linux/Mac/Windows 


> Options for analysis in IDA batch mode 


> Multiple bug fixes and code review 
> Improvements to Types and VTBL reconstruction 


> New Features: 
v dump Ctrees information for additional analysis 


v dump all reconstructed types information 


https://github.com/REhints/HexRaysCodeXplorer 


The new RE book is coming soon! 


Rootkits 
and Bootkits 


Reversing Modern Malware and 
Next Generation Threats 


Alex Matrosov, Eugene Rodionov, 
and Sergey Bratus 


https://www.nostarch.com/rootkits 


THE END ! Really !? 